dNLS (New, Opera) - Preprocessing QC statistics ¶

July 2025 - Nancy Y¶

Rerun by Sagy on Sep 15, 2025: removed batch3, CD41 and WT Untreated, and renamed dNLS_new_CLEAN -> dNLS.

In [1]:
import sys
import os

NOVA_HOME = '/home/projects/hornsteinlab/Collaboration/NOVA'
NOVA_DATA_HOME = '/home/projects/hornsteinlab/Collaboration/NOVA'
os.environ['NOVA_HOME'] = NOVA_HOME
sys.path.insert(1, os.getenv("NOVA_HOME"))
print(f"NOVA_HOME: {os.getenv('NOVA_HOME')}")

LOGS_PATH = os.path.join(NOVA_HOME, 'outputs', 'preprocessing', 'ManuscriptFinalData_80pct', 'dNLS', 'logs')
PLOT_PATH = None

root_directory_raw = os.path.join(NOVA_DATA_HOME, 'input', 'images', 'raw', 'OPERA_dNLS_6_batches_NOVA_sorted')
root_directory_proc = os.path.join(NOVA_DATA_HOME, 'input', 'images', 'processed', 'ManuscriptFinalData_80pct', 'dNLS')

import pandas as pd
import numpy as np
import contextlib
import io
from IPython.display import display, Javascript

from tools.preprocessing_tools.qc_reports.qc_utils import log_files_qc, run_validate_folder_structure, display_diff, sample_and_calc_variance, \
                                                show_site_survival_dapi_brenner, show_site_survival_dapi_cellpose, \
                                                show_site_survival_dapi_tiling, show_site_survival_target_brenner, \
                                                calc_total_sums, plot_filtering_heatmap, show_total_sum_tables, \
                                                plot_cell_count, plot_catplot, plot_hm_of_mean_cell_count_per_tile, \
                                                run_calc_hist_new, show_total_valid_tiles_per_marker_and_batch
                                                
from tools.preprocessing_tools.qc_reports.qc_config import dnls_opera_panels, dnls_opera_markers, dnls_opera_marker_info, \
                                                           dnls_opera_cell_lines, \
                                                dnls_opera_cell_lines_to_cond, dnls_opera_cell_lines_for_disp, dnls_opera_reps, \
                                                dnls_opera_line_colors, dnls_opera_lines_order, dnls_opera_custom_palette, \
                                                dnls_opera_expected_dapi_raw, markers, custom_palette,dnls_opera_cell_lines_to_reps

%load_ext autoreload
%autoreload 2
NOVA_HOME: /home/projects/hornsteinlab/Collaboration/NOVA
In [2]:
# choose batches
batches = [f'batch{i}' for i in [1,2,4,5,6]]
batches
Out[2]:
['batch1', 'batch2', 'batch4', 'batch5', 'batch6']
In [3]:
df = log_files_qc(LOGS_PATH, only_wt_cond=False, batches=batches, filename_split='-',site_location=0)
df = df[df.cell_line == 'dNLS']

df_dapi = df[df.marker=='DAPI']
df_target = df[df.marker!='DAPI']
reading logs of batch5
reading logs of batch6
reading logs of batch2
reading logs of batch4
reading logs of batch1

Total of 5 files were read.
Before dup handeling  (264355, 21)
After duplication removal #1: (264355, 22)
After duplication removal #2: (264355, 22)

Actual Files Validation¶

Raw Files Validation¶

  1. How many site tiff files do we have in each folder?
  2. Are all existing files valid? (tif, at least 2049kB, not corrupted)
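The validity criteria listed above can be sketched as a small stand-alone check. This is only an illustration, not the logic inside `run_validate_folder_structure`: the helper name `is_valid_raw_site`, the 1 kB = 1024 bytes convention, and the use of the TIFF header bytes as a stand-in for "not corrupted" are all assumptions.

```python
import os

# "at least 2049kB" (assumption: 1 kB = 1024 bytes)
MIN_RAW_SIZE_BYTES = 2049 * 1024
# Classic TIFF headers: little-endian ("II") and big-endian ("MM").
# BigTIFF files use a different version number and are not covered here.
TIFF_MAGIC = (b"II*\x00", b"MM\x00*")


def is_valid_raw_site(path, min_size=MIN_RAW_SIZE_BYTES):
    """Hypothetical check: tif extension, minimum size, readable TIFF header."""
    if not path.lower().endswith((".tif", ".tiff")):
        return False
    try:
        if os.path.getsize(path) < min_size:
            return False
        with open(path, "rb") as fh:
            # A truncated or non-TIFF file fails this header comparison
            return fh.read(4) in TIFF_MAGIC
    except OSError:
        # Missing or unreadable files are treated as invalid
        return False
```

A header check only catches gross corruption; fully decoding the image would be a stricter (and slower) test.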
In [4]:
raws = run_validate_folder_structure(root_directory_raw, False, dnls_opera_panels, dnls_opera_markers.copy(),PLOT_PATH, dnls_opera_marker_info,
                                    dnls_opera_cell_lines_to_cond, dnls_opera_reps, dnls_opera_cell_lines_for_disp, 
                                    dnls_opera_expected_dapi_raw,
                                     batches=batches, fig_width=2,fig_height=12,cell_lines_to_reps=dnls_opera_cell_lines_to_reps,
                                     expected_count=250, check_antibody=False)
batch1
Folder structure is valid.
No bad files are found.
Total Sites:  60000
df_reset (87, 3) colored_df (87, 3)
         Rep dNLS_DOX dNLS_Untreated
Marker                              
G3BP1   rep1      250            250
G3BP1   rep2      250            250
G3BP1   rep3      250            250
NONO    rep1      250            250
NONO    rep2      250            250
...      ...      ...            ...
TOMM20  rep2      250            250
TOMM20  rep3      250            250
DAPI    rep1     3000           3000
DAPI    rep2     3000           3000
DAPI    rep3     3000           3000

[87 rows x 3 columns]
========
batch2
Folder structure is valid.
No bad files are found.
Total Sites:  60000
df_reset (87, 3) colored_df (87, 3)
         Rep dNLS_DOX dNLS_Untreated
Marker                              
G3BP1   rep1      250            250
G3BP1   rep2      250            250
G3BP1   rep3      250            250
NONO    rep1      250            250
NONO    rep2      250            250
...      ...      ...            ...
TOMM20  rep2      250            250
TOMM20  rep3      250            250
DAPI    rep1     3000           3000
DAPI    rep2     3000           3000
DAPI    rep3     3000           3000

[87 rows x 3 columns]
========
batch4
Folder structure is valid.
No bad files are found.
Total Sites:  60000
df_reset (87, 3) colored_df (87, 3)
         Rep dNLS_DOX dNLS_Untreated
Marker                              
G3BP1   rep1      250            250
G3BP1   rep2      250            250
G3BP1   rep3      250            250
NONO    rep1      250            250
NONO    rep2      250            250
...      ...      ...            ...
TOMM20  rep2      250            250
TOMM20  rep3      250            250
DAPI    rep1     3000           3000
DAPI    rep2     3000           3000
DAPI    rep3     3000           3000

[87 rows x 3 columns]
========
batch5
Folder structure is valid.
No bad files are found.
Total Sites:  59736
df_reset (87, 3) colored_df (87, 3)
         Rep dNLS_DOX dNLS_Untreated
Marker                              
G3BP1   rep1      250            250
G3BP1   rep2      250            250
G3BP1   rep3      250            250
NONO    rep1      250            250
NONO    rep2      250            250
...      ...      ...            ...
TOMM20  rep2      250            250
TOMM20  rep3      250            250
DAPI    rep1     3000           2934
DAPI    rep2     3000           3000
DAPI    rep3     3000           3000

[87 rows x 3 columns]
========
batch6
Folder structure is valid.
No bad files are found.
Total Sites:  59997
df_reset (87, 3) colored_df (87, 3)
         Rep dNLS_DOX dNLS_Untreated
Marker                              
G3BP1   rep1      250            250
G3BP1   rep2      250            250
G3BP1   rep3      250            250
NONO    rep1      250            250
NONO    rep2      250            250
...      ...      ...            ...
TOMM20  rep2      250            250
TOMM20  rep3      250            250
DAPI    rep1     3000           3000
DAPI    rep2     2999           3000
DAPI    rep3     3000           3000

[87 rows x 3 columns]
========
====================
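As a sanity check on the "Total Sites" figures above, the expected total per batch can be reproduced from the table shape. This is a sketch under stated assumptions: the 87 table rows are taken to be 29 markers (28 targets plus DAPI) times 3 reps, and every target row hidden behind the `...` is assumed to hold 250 sites per condition. The function name `expected_total_sites` is illustrative only.

```python
def expected_total_sites(n_table_rows=87, n_reps=3, n_conditions=2,
                         sites_per_target=250, sites_per_dapi=3000):
    """Expected number of raw site files in one batch (illustrative)."""
    n_markers = n_table_rows // n_reps   # 29 markers, one of which is DAPI
    n_targets = n_markers - 1            # 28 target markers
    target_sites = n_targets * n_reps * n_conditions * sites_per_target
    dapi_sites = n_reps * n_conditions * sites_per_dapi
    return target_sites + dapi_sites


expected = expected_total_sites()

# "Total Sites" values copied from the raw validation output above
reported = {"batch1": 60000, "batch2": 60000, "batch4": 60000,
            "batch5": 59736, "batch6": 59997}
shortfall = {batch: expected - n for batch, n in reported.items()}

print(expected)   # 60000
print(shortfall)  # only batch5 (264) and batch6 (3) fall short
```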

Processed Files Validation¶

  1. How many site npy files do we have in each folder? -> How many sites survived the pre-processing?
  2. Are all existing files valid? (at least 100kB, npy not corrupted)
In [5]:
procs = run_validate_folder_structure(root_directory_proc, True, dnls_opera_panels, dnls_opera_markers,PLOT_PATH,dnls_opera_marker_info,
                                    dnls_opera_cell_lines_to_cond, dnls_opera_reps, dnls_opera_cell_lines_for_disp, dnls_opera_expected_dapi_raw,
                                    fig_width=2,fig_height=12,cell_lines_to_reps=dnls_opera_cell_lines_to_reps,
                                     expected_count=250, check_antibody=False, batches=batches)
batch1
Folder structure is valid.
No bad files are found.
Total Sites:  26841
df_reset (87, 3) colored_df (87, 3)
         Rep dNLS_DOX dNLS_Untreated
Marker                              
G3BP1   rep1      162            104
G3BP1   rep2        8             74
G3BP1   rep3       77              0
NONO    rep1      211            209
NONO    rep2      229            121
...      ...      ...            ...
TOMM20  rep2      104             50
TOMM20  rep3       45              8
DAPI    rep1     1571           1277
DAPI    rep2     1442           1292
DAPI    rep3     1584            861

[87 rows x 3 columns]
========
batch2
Folder structure is valid.
No bad files are found.
Total Sites:  27113
df_reset (87, 3) colored_df (87, 3)
         Rep dNLS_DOX dNLS_Untreated
Marker                              
G3BP1   rep1      126             91
G3BP1   rep2        3            186
G3BP1   rep3      136              9
NONO    rep1      249            181
NONO    rep2      205            174
...      ...      ...            ...
TOMM20  rep2      123              4
TOMM20  rep3       85             25
DAPI    rep1     1618           1229
DAPI    rep2     1342           1500
DAPI    rep3     1720            690

[87 rows x 3 columns]
========
batch4
Folder structure is valid.
No bad files are found.
Total Sites:  47069
df_reset (87, 3) colored_df (87, 3)
         Rep dNLS_DOX dNLS_Untreated
Marker                              
G3BP1   rep1      221             90
G3BP1   rep2      205            171
G3BP1   rep3      222            147
NONO    rep1      246            181
NONO    rep2      246            196
...      ...      ...            ...
TOMM20  rep2      184            198
TOMM20  rep3       83            235
DAPI    rep1     2530           2193
DAPI    rep2     2537           2320
DAPI    rep3     2483           2040

[87 rows x 3 columns]
========
batch5
Folder structure is valid.
No bad files are found.
Total Sites:  54544
df_reset (87, 3) colored_df (87, 3)
         Rep dNLS_DOX dNLS_Untreated
Marker                              
G3BP1   rep1      228            228
G3BP1   rep2      220            222
G3BP1   rep3      195            158
NONO    rep1      250            247
NONO    rep2      249            249
...      ...      ...            ...
TOMM20  rep2      167            243
TOMM20  rep3      184            228
DAPI    rep1     2678           2816
DAPI    rep2     2702           2883
DAPI    rep3     2645           2674

[87 rows x 3 columns]
========
batch6
Folder structure is valid.
No bad files are found.
Total Sites:  20979
df_reset (87, 3) colored_df (87, 3)
         Rep dNLS_DOX dNLS_Untreated
Marker                              
G3BP1   rep1      112             40
G3BP1   rep2       13            124
G3BP1   rep3      112             12
NONO    rep1      183            111
NONO    rep2      240            140
...      ...      ...            ...
TOMM20  rep2       97              3
TOMM20  rep3       57              8
DAPI    rep1     1212           1025
DAPI    rep2     1306            787
DAPI    rep3     1190            641

[87 rows x 3 columns]
========
====================

Difference between Raw and Processed¶

In [6]:
display_diff(batches, raws, procs, PLOT_PATH, fig_width=2,fig_height=12)
batch1
========
batch2
========
batch4
========
batch5
========
batch6
========
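The plots produced by `display_diff` are not reproduced in this export. As a rough numeric companion, the overall site survival per batch can be computed from the "Total Sites" values printed by the raw and processed validation cells. This is a minimal sketch, not what `display_diff` itself computes; the helper name `survival_percent` is an assumption.

```python
# "Total Sites" copied from the raw and processed validation outputs
raw_total_sites = {"batch1": 60000, "batch2": 60000, "batch4": 60000,
                   "batch5": 59736, "batch6": 59997}
proc_total_sites = {"batch1": 26841, "batch2": 27113, "batch4": 47069,
                    "batch5": 54544, "batch6": 20979}


def survival_percent(raw, proc):
    """Percentage of raw sites that still exist after preprocessing."""
    return {batch: 100.0 * proc[batch] / raw[batch] for batch in raw}


survival = survival_percent(raw_total_sites, proc_total_sites)
for batch, pct in survival.items():
    print(f"{batch}: {pct:.1f}% of raw sites survived")
```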

Variance in each batch (of processed files)¶

In [7]:
for batch in batches:
    with contextlib.redirect_stdout(io.StringIO()):
        var = sample_and_calc_variance(root_directory_proc, batch, 
                                       sample_size_per_markers=500, cond_count=2, rep_count=len(dnls_opera_reps), 
                                       num_markers=len(dnls_opera_markers))
    print(f'{batch} var: ',var)
batch1 var:  0.04806470200097039
batch2 var:  0.04792129771502062
batch4 var:  0.04648118313288534
batch5 var:  0.04547389232528459
batch6 var:  0.04670199332099523

Preprocessing Filtering qc¶

The filters are listed in the order in which they are applied.

1. % site survival after Brenner on DAPI channel¶

Percentage out of the total sites
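For readers unfamiliar with the Brenner focus measure, here is a minimal pure-Python sketch of the textbook definition: the sum of squared differences between pixels two columns apart. It is not the NOVA implementation; the direction of the difference, any normalisation, and the thresholds used to accept or reject a site are not shown in this notebook and are left out here.

```python
def brenner_score(image, shift=2):
    """Textbook Brenner gradient of a 2D image given as a list of rows.

    Higher values indicate sharper (better focused) images.
    """
    total = 0
    for row in image:
        for x in range(len(row) - shift):
            diff = row[x + shift] - row[x]
            total += diff * diff
    return total


# Toy images: a sharp edge, a smooth ramp with the same range, and a flat field
sharp = [[0, 0, 0, 10, 10, 10]] * 2
blurred = [[0, 2, 4, 6, 8, 10]] * 2
flat = [[5, 5, 5, 5, 5, 5]] * 2

print(brenner_score(sharp), brenner_score(blurred), brenner_score(flat))
```

The sharp edge scores highest and the flat field scores zero, which is why a site can be rejected both for being out of focus and for being nearly empty.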

In [8]:
dapi_filter_by_brenner = show_site_survival_dapi_brenner(df_dapi,batches, dnls_opera_line_colors, dnls_opera_panels, 
                                                        dnls_opera_reps, figsize=(3,5),vmax=250, 
                                                         to_ignore={'cell_line_cond':'WT Untreated','rep':'rep3'})

2. % Site survival after Cellpose¶

Percentage out of the sites that passed the previous filter. Absolute values are given in parentheses.

A site will be filtered out if Cellpose found 0 cells in it.
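Because each percentage is relative to the previous filter rather than to the raw total, the numbers compound. The sketch below illustrates that bookkeeping with made-up counts (they are not taken from the logs); `stagewise_survival` is a hypothetical helper, not part of `qc_utils`.

```python
def stagewise_survival(counts):
    """counts: ordered list of (stage_name, n_sites), starting with the total.

    Returns a list of (stage_name, n_sites, percent_of_previous_stage).
    """
    result = []
    for (_, previous), (stage, current) in zip(counts, counts[1:]):
        pct = 100.0 * current / previous if previous else 0.0
        result.append((stage, current, pct))
    return result


# Hypothetical site counts for one marker, condition and replicate
example = [("total", 250), ("brenner_dapi", 200),
           ("cellpose", 190), ("tiling", 171)]

for stage, n, pct in stagewise_survival(example):
    print(f"{stage}: {pct:.0f}% ({n})")
```

In this example the three filters keep 80%, 95% and 90% of their respective inputs, which leaves 171 of 250 sites (about 68%) overall.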

In [9]:
dapi_filter_by_cellpose = show_site_survival_dapi_cellpose(df_dapi, batches, dapi_filter_by_brenner, dnls_opera_line_colors, 
                                                           dnls_opera_panels, dnls_opera_reps, figsize=(3,5),
                                                          to_ignore={'cell_line_cond':'WT Untreated','rep':'rep3'})

3. % Site survival by tiling¶

Percentage out of the sites that passed the previous filter. Absolute values are given in parentheses.

A site is filtered out if, after tiling, no tile contains at least one whole cell detected by Cellpose.
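The tiling rule can be stated in a few lines. This sketch assumes the per-tile whole-cell counts are already available as a list; how a cell is judged to lie wholly inside a tile is part of the preprocessing code and is not shown here. Both function names are hypothetical.

```python
def count_valid_tiles(whole_cells_per_tile):
    """Number of tiles that contain at least one whole cell."""
    return sum(1 for n in whole_cells_per_tile if n >= 1)


def site_passes_tiling(whole_cells_per_tile):
    """A site survives if at least one of its tiles is valid."""
    return count_valid_tiles(whole_cells_per_tile) > 0


print(site_passes_tiling([0, 0, 2, 0]))  # True: one tile holds two whole cells
print(site_passes_tiling([0, 0, 0, 0]))  # False: cells exist only at tile borders, or not at all
```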

In [10]:
dapi_filter_by_tiling=show_site_survival_dapi_tiling(df_dapi, batches, dapi_filter_by_cellpose, dnls_opera_line_colors, dnls_opera_panels, 
                                                     dnls_opera_reps, figsize=(3,5),
                                                    to_ignore={'cell_line_cond':'WT Untreated','rep':'rep3'})

4. % Site survival after Brenner on target channel¶

Percentage out of the sites that passed the previous filter. Absolute values are given in parentheses (where they differ from the percentages).

In [11]:
show_site_survival_target_brenner(df_dapi, 
                                  df_target, 
                                  dapi_filter_by_tiling, 
                                  markers=dnls_opera_markers,
                                  figsize=(3,12)
                                 )

# Nancy: there was a change Noam did here with "to_ignore" - removed it since function was missing. 

Statistics About the Processed Files¶

In [12]:
names = ['Total number of tiles', 'Total number of whole cells']
stats = ['n_valid_tiles','site_whole_cells_counts_sum','site_cell_count','site_cell_count_sum']
total_sum = calc_total_sums(df_target, df_dapi, stats, dnls_opera_markers)

Total tiles¶

In [13]:
# markers_for_dnls = markers.copy() #TODO need to change according to - if we use all markers or just the d8 ones!!!!
# markers_for_dnls.remove('TIA1')
# markers_for_dnls += ['TDP43B']

total_sum[total_sum.marker.isin(dnls_opera_markers)].n_valid_tiles.sum()
Out[13]:
1192604

Total whole nuclei in tiles¶

In [14]:
total_sum[total_sum.marker =='DAPI'].site_whole_cells_counts_sum.sum()
Out[14]:
485916.0

Total nuclei in sites¶

In [15]:
total_sum[total_sum.marker =='DAPI'].site_cell_count.sum()
Out[15]:
1742813.0
In [16]:
show_total_sum_tables(total_sum)
                n_valid_tiles  % valid tiles  site_whole_cells_counts_sum  site_cell_count
batch1
count              235.000000     235.000000                   235.000000       235.000000
mean               719.621277       7.196213                   837.914894      3014.646809
std                582.994661       5.829947                   642.511899      2353.314551
min                  0.000000       0.000000                     5.000000        22.000000
25%                192.000000       1.920000                   312.500000      1124.000000
50%                642.000000       6.420000                   726.000000      2638.000000
75%               1034.000000      10.340000                  1280.500000      4734.000000
max               2535.000000      25.350000                  2417.000000      8549.000000
sum             169111.000000            NaN                196910.000000    708442.000000
expected_count     450.000000     450.000000                   450.000000       450.000000

                n_valid_tiles  % valid tiles  site_whole_cells_counts_sum  site_cell_count
batch2
count              240.000000     240.000000                   240.000000       240.000000
mean               678.354167       6.783542                   875.033333      3177.304167
std                564.585444       5.645854                   657.389339      2451.508347
min                  2.000000       0.020000                     2.000000         2.000000
25%                201.000000       2.010000                   343.750000      1360.500000
50%                540.500000       5.405000                   775.500000      2766.000000
75%                993.250000       9.932500                  1346.500000      4784.000000
max               2190.000000      21.900000                  2561.000000      9748.000000
sum             162805.000000            NaN                210008.000000    762553.000000
expected_count     450.000000     450.000000                   450.000000       450.000000

                n_valid_tiles  % valid tiles  site_whole_cells_counts_sum  site_cell_count
batch4
count              240.000000     240.000000                   240.000000     2.400000e+02
mean              1340.966667      13.409667                  1703.458333     6.167167e+03
std                521.931590       5.219316                   629.675348     2.316017e+03
min                 82.000000       0.820000                    90.000000     3.050000e+02
25%                930.000000       9.300000                  1265.000000     4.398750e+03
50%               1458.500000      14.585000                  1739.500000     6.525000e+03
75%               1708.000000      17.080000                  2274.500000     8.248000e+03
max               2354.000000      23.540000                  2818.000000     9.954000e+03
sum             321832.000000            NaN                408830.000000     1.480120e+06
expected_count     450.000000     450.000000                   450.000000     4.500000e+02

                n_valid_tiles  % valid tiles  site_whole_cells_counts_sum  site_cell_count
batch5
count              240.000000     240.000000                   240.000000     2.400000e+02
mean              1656.920833      16.569208                  2294.004167     8.277175e+03
std                484.480058       4.844801                   433.530466     1.517752e+03
min                392.000000       3.920000                  1238.000000     4.382000e+03
25%               1476.750000      14.767500                  2038.500000     7.319000e+03
50%               1774.500000      17.745000                  2380.500000     8.620000e+03
75%               1957.000000      19.570000                  2592.000000     9.322000e+03
max               2491.000000      24.910000                  3248.000000     1.161400e+04
sum             397661.000000            NaN                550561.000000     1.986522e+06
expected_count     450.000000     450.000000                   450.000000     4.500000e+02

                n_valid_tiles  % valid tiles  site_whole_cells_counts_sum  site_cell_count
batch6
count              236.000000     236.000000                   236.000000       236.000000
mean               598.283898       5.982839                   956.826271      3301.411017
std                668.185839       6.681858                   978.171947      3431.604088
min                  0.000000       0.000000                     1.000000         3.000000
25%                 33.000000       0.330000                    52.000000       185.000000
50%                290.000000       2.900000                   531.000000      1797.000000
75%               1037.000000      10.370000                  1571.000000      5338.000000
max               2492.000000      24.920000                  3386.000000     12045.000000
sum             141195.000000            NaN                225811.000000    779133.000000
expected_count     450.000000     450.000000                   450.000000       450.000000

                n valid tiles  % valid tiles  site_whole_cells_counts_sum  site_cell_count
All batches
count            1.191000e+03    1191.000000                 1.191000e+03     1.191000e+03
mean             1.001347e+03      10.013468                 1.336793e+03     4.799975e+03
std              7.069285e+02       7.069285                 8.983272e+02     3.253122e+03
min              0.000000e+00       0.000000                 1.000000e+00     2.000000e+00
25%              3.430000e+02       3.430000                 5.675000e+02     1.983000e+03
50%              9.530000e+02       9.530000                 1.345000e+03     4.725000e+03
75%              1.616500e+03      16.165000                 2.141000e+03     7.796000e+03
max              2.535000e+03      25.350000                 3.386000e+03     1.204500e+04
sum              1.192604e+06            NaN                 1.592120e+06     5.716770e+06
expected_count   4.500000e+02     450.000000                 4.500000e+02     4.500000e+02

Show Total Tile Counts¶

For each batch, cell line, replicate and marker: Total number of tiles

First, we look at all cell lines together:¶

In [17]:
show_total_valid_tiles_per_marker_and_batch(total_sum, vmax=1000)

Separating into cell lines & batches:¶

In [18]:
to_heatmap = total_sum.rename(columns={'n_valid_tiles':'index'})
plot_filtering_heatmap(to_heatmap, 
                       extra_index='marker', 
                       vmin=None, vmax=None,
                       xlabel = 'Total number of tiles', 
                       show_sum=True, 
                       figsize=(7,28), 
                       fmt=".0f")
/home/projects/hornsteinlab/Collaboration/NOVA/tools/preprocessing_tools/qc_reports/qc_utils.py:394: UserWarning: FixedFormatter should only be used together with FixedLocator
  ax.set_yticklabels(ax.get_yticklabels(), fontsize=6)

Show Total Whole Cell Counts¶

For each batch, cell line, replicate and marker: Total number of whole cells

In [19]:
to_heatmap = total_sum.rename(columns={'site_whole_cells_counts_sum':'index'})
plot_filtering_heatmap(to_heatmap, 
                       extra_index='marker', 
                       vmin=None, vmax=None,
                       xlabel = 'Total number of whole cells', 
                       show_sum=True, 
                       figsize=(7,28), 
                       fmt=".0f")
/home/projects/hornsteinlab/Collaboration/NOVA/tools/preprocessing_tools/qc_reports/qc_utils.py:394: UserWarning: FixedFormatter should only be used together with FixedLocator
  ax.set_yticklabels(ax.get_yticklabels(), fontsize=6)

Show Cell Count Statistics per Batch¶

In [20]:
df_no_empty_sites = df_dapi[df_dapi.n_valid_tiles !=0]
plot_cell_count(df_no_empty_sites, dnls_opera_lines_order, dnls_opera_custom_palette, y='site_cell_count_sum', 
                title='Cell Count Average per Site (from tiles)')

plot_cell_count(df_no_empty_sites, dnls_opera_lines_order, dnls_opera_custom_palette, y='site_whole_cells_counts_sum',
                title='Whole Cell Count Average per Site')

plot_cell_count(df_no_empty_sites, dnls_opera_lines_order, dnls_opera_custom_palette, y='site_cell_count',
               title='Cellpose Cell Count Average per Site')

Show Tiles per Site Statistics¶

In [21]:
df_dapi.groupby(['cell_line_cond']).n_valid_tiles.mean()
Out[21]:
cell_line_cond
dNLS DOX          5.473429
dNLS Untreated    4.293922
Name: n_valid_tiles, dtype: float64
In [22]:
df_dapi[['site_cell_count']].mean()
Out[22]:
site_cell_count    23.606076
dtype: float64
In [23]:
plot_catplot(df_dapi, custom_palette,dnls_opera_reps, x='n_valid_tiles', x_title='valid tiles count', batch_min=1, batch_max=6, height=6)
/home/projects/hornsteinlab/Collaboration/NOVA/tools/preprocessing_tools/qc_reports/qc_utils.py:1063: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df.loc[:, 'batch_rep'] = df['batch'] + " " + df['rep']

Show Mean of cell count in valid tiles¶

In [24]:
plot_hm_of_mean_cell_count_per_tile(df_dapi, split_by='rep', rows='cell_line_cond', columns='panel', figsize=(14,3))

Assessing Staining Reproducibility and Outliers¶

In [25]:
# for batch in batches:
#     print(batch)
#     run_calc_hist_new(f'{batch}', dnls_opera_cell_lines_for_disp, dnls_opera_markers,
#                       root_directory_raw, root_directory_proc,
#                            hist_sample=10,sample_size_per_markers=200, ncols=8, nrows=4, dnls=True)
#     print("="*30)
In [26]:
# save notebook as HTML ( the HTML will be saved in the same folder the original script is)
# from IPython.display import display, Javascript
# display(Javascript('IPython.notebook.save_checkpoint();'))
# os.system(f'jupyter nbconvert --to html {NOVA_HOME}/tools/preprocessing_tools/qc_reports/qc_report_dNLS_Opera.ipynb --output {NOVA_HOME}/manuscript/preprocessing_qc_reports/ManuscriptFinalData/qc_report_dNLS_Opera.html')
In [ ]: